Goto

Collaborating Authors

 recognition system


Disneyland Now Uses Face Recognition on Visitors

WIRED

Plus: The NSA tests Anthropic's Mythos Preview to find vulnerabilities, a Finnish teen is charged over the Scattered Spider hacking spree, and more. A gunman attempted to enter the White House Correspondents' Dinner in Washington, DC, last weekend, while President Donald Trump, Vice President JD Vance, and other administration officials were in attendance. Media reports and Trump himself quickly identified the suspected shooter as 31-year-old engineer and computer scientist Cole Tomas Allen. The California resident was arrested at the scene on Saturday and appeared Monday in the US District Court for the District of Columbia to face three federal charges: attempting to assassinate the president, transportation of a firearm in interstate commerce, and discharge of a firearm during a crime of violence. The authentication standards body known as the FIDO Alliance announced working groups this week along with Google and Mastercard to develop technical guardrails for validating and protecting transactions initiated by an AI agent .



Hierarchical Object Representation for Open-Ended Object Category Learning and Recognition

Neural Information Processing Systems

Most robots lack the ability to learn new objects from past experiences. To migrate a robot to a new environment one must often completely re-generate the knowledgebase that it is running with. Since in open-ended domains the set of categories to be learned is not predefined, it is not feasible to assume that one can pre-program all object categories required by robots. Therefore, autonomous robots must have the ability to continuously execute learning and recognition in a concurrent and interleaved fashion. This paper proposes an open-ended 3D object recognition system which concurrently learns both the object categories and the statistical features for encoding objects. In particular, we propose an extension of Latent Dirichlet Allocation to learn structural semantic features (i.e.



VoiceBlock: PrivacythroughReal-TimeAdversarial AttackswithAudio-to-AudioModels

Neural Information Processing Systems

As governments and corporations adopt deep learning systems to collect and analyze user-generated audio data, concerns about security and privacy naturally emerge in areas such as automatic speaker recognition. While audio adversarial examples offer one route to mislead or evade these invasive systems, they are typically crafted through time-intensive offline optimization, limiting their usefulness in streaming contexts. Inspired by architectures for audio-toaudio tasks such as denoising and speech enhancement, we propose a neural network model capable ofadversarially modifying auser'saudio stream inrealtime. Our model learns to apply a time-varying finite impulse response (FIR) filter to outgoing audio, allowing for effective and inconspicuous perturbations on a small fixed delay suitable for streaming tasks. We demonstrate our model is highly effective at de-identifying user speech from speaker recognition and able to transfer to an unseen recognition system. We conduct a perceptual study and find that our method produces perturbations significantly less perceptible than baseline anonymization methods, when controlling for effectiveness. Finally, we provide an implementation of our model capable of running in real-time on asingle CPU thread.



Face Reconstruction from Facial Templates by Learning Latent Space of a Generator Network

Neural Information Processing Systems

In this paper, we focus on the template inversion attack against face recognition systems and propose a new method to reconstruct face images from facial templates. Within a generative adversarial network (GAN)-based framework, we learn a mapping from facial templates to the intermediate latent space of a pre-trained face generation network, from which we can generate high-resolution realistic reconstructed face images. We show that our proposed method can be applied in whitebox and blackbox attacks against face recognition systems. Furthermore, we evaluate the transferability of our attack when the adversary uses the reconstructed face image to impersonate the underlying subject in an attack against another face recognition system. Considering the adversary's knowledge and the target face recognition system, we define five different attacks and evaluate the vulnerability of state-of-the-art face recognition systems. Our experiments show that our proposed method achieves high success attack rates in whitebox and blackbox scenarios. Furthermore, the reconstructed face images are transferable and can be used to enter target face recognition systems with a different feature extractor model. We also explore important areas in the reconstructed face images that can fool the target face recognition system.


EEG Emotion Recognition Through Deep Learning

arXiv.org Artificial Intelligence

An advanced emotion classification model was developed using a CNN-Transformer architecture for emotion recognition from EEG brain wave signals, effectively distinguishing among three emotional states, positive, neutral and negative. The model achieved a testing accuracy of 91%, outperforming traditional models such as SVM, DNN, and Logistic Regression. Training was conducted on a custom dataset created by merging data from SEED, SEED-FRA, and SEED-GER repositories, comprising 1,455 samples with EEG recordings labeled according to emotional states. The combined dataset represents one of the largest and most culturally diverse collections available. Additionally, the model allows for the reduction of the requirements of the EEG apparatus, by leveraging only 5 electrodes of the 62. This reduction demonstrates the feasibility of deploying a more affordable consumer-grade EEG headset, thereby enabling accessible, at-home use, while also requiring less computational power. This advancement sets the groundwork for future exploration into mood changes induced by media content consumption, an area that remains underresearched. Integration into medical, wellness, and home-health platforms could enable continuous, passive emotional monitoring, particularly beneficial in clinical or caregiving settings where traditional behavioral cues, such as facial expressions or vocal tone, are diminished, restricted, or difficult to interpret, thus potentially transforming mental health diagnostics and interventions...


Going Places: Place Recognition in Artificial and Natural Systems

arXiv.org Artificial Intelligence

Place recognition--the process of an animal, person or robot recognizing a familiar location in the world--has attracted significant attention across multiple disciplines. In animals, this capability has evolved over millions of years through sophisticated neural mechanisms: hippocampal place cells fire at specific spatial locations (1), entorhinal grid cells provide spatial coordinates through hexagonal firing patterns (2), while diverse species demonstrate remarkable navigation--from desert ants using celestial cues and visual panoramas (3) to migratory birds returning to precise breeding sites across hemispheric distances (4). Humans extend these biological foundations with unique cognitive abilities, recognizing places not only through sensory perception but also through semantic meaning, emotional associations, and cultural context--enabling us to identify familiar locations from descriptions, memories, or even fictional narratives (5). In artificial systems, place recognition underpins core robotics functions such as localization, mapping, and long-term autonomy, developing into a mature field that, while sometimes inspired by biological principles, often diverges significantly in implementation to optimize for computational efficiency and metric accuracy. As research has grown in the area, so too has a rich landscape of surveys and reviews that reflect the field's evolution and diversification.


Accurate online action and gesture recognition system using detectors and Deep SPD Siamese Networks

arXiv.org Artificial Intelligence

Human activity recognition is an important research topic in pattern recognition field. It has been the subject of many studies in the past two decades because of its importance in numerous areas such as security, health, daily activity, energy consumption and robotics. Recently, some works on the recognition of hand gestures or human actions from skeletal data are based on the modeling of the skeleton's movement as manifold-based representation and proposed deep neural networks on this structure [1, 2, 3]. These approaches demonstrated their potential in the processing of skeletal data. Most of them are applied on offline human action recognition which is useful in time-limited tasks. However, in many applications, simply recognizing a single gesture in a given segmented sequence is not enough, especially in monitoring systems and virtual-reality devices which need to detect human movements moment by moment in continuous videos. In these online recognition systems, it is important to detect the existence of an action as early as possible after its beginning. It is also essential to determine the nature of the movement within a sequence of frames, without having information about the number of gestures present within the video, their starting times or their durations, unlike the segmented action recognition. In this paper, we propose to use a manifold-based model in order to build an online motion recognition system that detects and identifies different human activities in unsegmented skeletal sequences.